On Approximate Nearest Neighbors in Non-Euclidean Spaces
Abstract
The nearest neighbor search (NNS) problem is the following: given a set of $n$ points $P = \{p_1, \ldots, p_n\}$ in some metric space $X$, preprocess $P$ so as to efficiently answer queries that require finding a point in $P$ closest to a query point $q \in X$. The approximate nearest neighbor search ($c$-NNS) is a relaxation of NNS that allows returning any point within $c$ times the distance to the nearest neighbor (called a $c$-nearest neighbor). This problem is of major and growing importance to a variety of applications. In this paper, we give a $(4\lceil \log_{1+\rho} \log 4d \rceil + 3)$-NNS algorithm in $l_\infty^d$ with $O(dn^{1+\rho} \log n)$ storage and $O(d \log n)$ query time. In particular, this yields the first algorithm for $O(1)$-NNS for $l_\infty$ with subexponential storage. The preprocessing time is close to linear in the size of the data structure. The algorithm can also be used (after simple modifications) to output the exact nearest neighbor in time bounded by $O(d \log n)$ plus the number of $(4\lceil \log_{1+\rho} \log 4d \rceil + 3)$-nearest neighbors of the query point. Building on this result, we also obtain an approximation algorithm for a general class of product metrics. Finally, we show that for any $c < 3$ the $c$-NNS problem in $l_\infty$ is provably hard for a version of the indexing model introduced by Hellerstein et al. [HKP97] (our upper bound can be adapted to work in this model).
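To make the $c$-NNS definition concrete, the following is a minimal sketch that checks by brute force whether a candidate point is a $c$-nearest neighbor of a query, assuming the $l_\infty$ norm (the largest coordinate-wise difference) as the metric. It is a linear scan for illustration only, not the data structure described in the abstract; the names `linf_distance`, `exact_nearest`, and `is_c_nearest` are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]


def linf_distance(p: Point, q: Point) -> float:
    """l_infinity distance: the largest absolute coordinate-wise difference."""
    return max(abs(a - b) for a, b in zip(p, q))


def exact_nearest(points: Sequence[Point], q: Point) -> Point:
    """Exact NNS by linear scan; O(d * n) per query, no preprocessing."""
    return min(points, key=lambda p: linf_distance(p, q))


def is_c_nearest(candidate: Point, points: Sequence[Point], q: Point, c: float) -> bool:
    """True if `candidate` lies within c times the distance from q to its
    exact nearest neighbor in `points` (the c-NNS acceptance condition)."""
    best = linf_distance(exact_nearest(points, q), q)
    return linf_distance(candidate, q) <= c * best


# Illustrative data (made up for this sketch).
points = [(0.0, 0.0), (2.0, 1.0), (5.0, 5.0)]
q = (0.5, 0.0)
```

With these points, `(0.0, 0.0)` is the exact nearest neighbor at distance 0.5, so for $c = 3$ any point within distance 1.5 of `q` is an acceptable answer: `(2.0, 1.0)` qualifies and `(5.0, 5.0)` does not.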
Similar resources
Quantitative Analysis of Nearest-Neighbors Search in High-Dimensional Sampling-Based Motion Planning
We quantitatively analyze the performance of exact and approximate nearest-neighbors algorithms on increasingly high-dimensional problems in the context of sampling-based motion planning. We study the impact of the dimension, number of samples, distance metrics, and sampling schemes on the efficiency and accuracy of nearest-neighbors algorithms. Efficiency measures computation time and accuracy...
Metric-Based Shape Retrieval in Large Databases
This paper examines the problem of database organization and retrieval based on computing metric pairwise distances. A low-dimensional Euclidean approximation of a high-dimensional metric space is not efficient, while search in a high-dimensional Euclidean space suffers from the “curse of dimensionality”. Thus, techniques designed for searching metric spaces must be used. We evaluate several su...
Fast Approximate Nearest Neighbor Methods for Non-Euclidean Manifolds with Applications to Human Activity Analysis in Videos
Approximate Nearest Neighbor (ANN) methods such as Locality Sensitive Hashing, Semantic Hashing, and Spectral Hashing, provide computationally efficient procedures for finding objects similar to a query object in large datasets. These methods have been successfully applied to search web-scale datasets that can contain millions of images. Unfortunately, the key assumption in these procedures is ...
High-Dimensional Similarity Search Using Data-Sensitive Space Partitioning
Nearest neighbor search has a wide variety of applications. Unfortunately, the majority of search methods do not scale well with dimensionality. Recent efforts have been focused on finding better approximate solutions that improve the locality of data using dimensionality reduction. However, it is possible to preserve the locality of data and find exact nearest neighbors in high dimensions with...
Approximate nearest neighbor algorithm based on navigable small world graphs
We propose a novel approach to solving the approximate k-nearest neighbor search problem in metric spaces. The search structure is based on a navigable small world graph with vertices corresponding to the stored elements, edges to links between them, and a variation of greedy algorithm for searching. The navigable small world is created simply by keeping old Delaunay graph approximation links p...
Publication date: 1998